multi-head attention
https://gyazo.com/7e9cf764869843934cc93708186da2cf
Runs multiple self-attention (scaled dot-product attention) heads in parallel.
The outputs of the self-attention heads are concatenated and fed into a fully connected layer (see the sketch below).
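A minimal PyTorch sketch of this structure, assuming illustrative sizes (d_model=512, num_heads=8); the class and parameter names are hypothetical, not from the source:

```python
# Minimal multi-head attention sketch (illustrative, not a reference implementation).
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One projection each for queries, keys, values, plus the final fully connected layer.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)  # applied after concatenating the heads

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, d_model = x.shape

        # Project, then split into heads: (batch, num_heads, seq_len, d_head)
        def split(t: torch.Tensor) -> torch.Tensor:
            return t.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

        q, k, v = split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x))

        # Scaled dot-product attention, computed for all heads in parallel.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        attn = scores.softmax(dim=-1)
        out = attn @ v  # (batch, num_heads, seq_len, d_head)

        # Concatenate the heads back together, then apply the fully connected layer.
        out = out.transpose(1, 2).contiguous().view(batch, seq_len, d_model)
        return self.w_o(out)

x = torch.randn(2, 10, 512)
print(MultiHeadAttention()(x).shape)  # torch.Size([2, 10, 512])
```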
#Transformer
#BERT